ABSTRACT:

Investigating the gene expression networks that regulate cancer initiation and development is crucial, but our
understanding remains largely incomplete. With innovations in RNA-sequencing technologies and computational biology,
long noncoding RNAs (lncRNAs) are being identified and characterized at a rapid pace. Recent findings reveal that
lncRNAs are involved in successive steps of cancer development. These lncRNAs interact with DNA, RNA, protein
molecules and/or their combinations, acting as vital regulators of chromatin organization as well as transcriptional
and post-transcriptional regulation. Their aberrant expression confers on neoplastic cells the capacity for tumor
initiation, growth, and metastasis. Here we emphasize their aberrant expression and function in cancers. We found that
chromosome 11 carries the highest number of lncRNAs specific to 14 different cancers related to obesity. Among
these cancers, breast cancer has the highest number of associated lncRNAs. The interacting partners of the
lncRNAs were analysed and domain-specific interactions were studied. The results showed that the lncRNAs H19
and MALAT1 can act as potential biomarkers for different cancers.

INTRODUCTION:

Recent studies have shown that only a
small fraction of the human genome consists of
protein-coding genes (~1.5%) and that the
remaining fraction does not encode any
protein [1]. The proportion of non-coding
sequence correlates with the
complexity of the organism [2]. Among the
non-coding transcripts are the long non-coding
RNAs (lncRNAs), which are more than
200 nucleotides in length. They belong to a
diverse class of transcribed RNA molecules that
do not encode proteins and instead have a
regulatory, catalytic or structural role [8].
LncRNAs are transcribed from intergenic or
intragenic regions or from some specific
chromosomal region. Both transcriptional and
epigenetic factors control the expression of
lncRNAs [4].


Emerging studies have shown that lncRNAs
play crucial roles in a broad range of biological
processes and are associated with a number of
diseases, such as cancer, cardiovascular disease,
and neurodegenerative diseases. Mutation or
dysregulation of lncRNA expression leads
to the development of various complex diseases [5].
Thus, lncRNAs are becoming critically
important for the understanding of several
important aspects of the life sciences. LncRNAs act
as drivers of tumor-suppressive and oncogenic
functions in some widely occurring cancers,
such as breast and prostate cancer.

In this work, we aim to highlight the emerging
impact of lncRNAs in cancer research, with a
particular focus on the mechanisms and
functions of lncRNAs. Studies on lncRNAs
have demonstrated the importance of the
non-protein-coding part of the human genome in
carcinogenesis, representing the cutting edge of
cancer research and focusing on cancer
suppression. Studies concerning the identity,
function, and dysregulation of lncRNAs in
cancer have only just begun, and recent
data suggest that they may serve as master
drivers of carcinogenesis. Increased research
on these RNAs will lead to a greater
understanding of cancer cell function, thereby
leading to novel clinical applications in
oncology.


MATERIALS AND METHODS: 

1. Selection of Cancer Type 

Being overweight or obese is linked to a higher
risk of 14 different types of cancer, as identified
in a 2016 review from a working group
assembled by the International Agency for
Research on Cancer. In the U.S. each year, about
28,000 new cancer diagnoses in men and
72,000 in women are due to overweight or
obesity, according to the National Cancer
Institute (NCI)
(http://www.health.com/breast-cancer/obesity-overweight-cancer#01-overweight-cancer-risk)
(https://www.wcrf.org/int/cancer-facts-figures/data-specific-cancers).
The list of cancers contains Breast Cancer, Colon Cancer,
Endometrial Cancer, Esophageal Cancer,
Gallbladder Cancer, Kidney Cancer, Liver
Cancer, Meningioma, Multiple Myeloma,
Pancreatic Cancer, Stomach Cancer, Thyroid
Cancer, Prostate Cancer.


2. Collection of lncRNAs

A list of lncRNAs specific to each cancer
was collected from the LncRNA Disease Database
2017 (http://www.cuilab.cn/lncrnadisease). We used the
experimentally supported lncRNA-disease
association data (genome build used for
re-mapping in this version: hg38). The full
database contained 888 lncRNAs, of which
269 were associated with the 14 selected cancers.
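
The selection step described above can be sketched in Python as follows. This is a minimal illustration only, assuming the association data have already been parsed into (lncRNA, disease) pairs; the records, the two-cancer subset `SELECTED_CANCERS` and the function name `lncrnas_by_cancer` are hypothetical and do not reproduce the actual database file or its column layout.

```python
from collections import defaultdict

# Hypothetical (lncRNA, disease) records used only for illustration; the
# real input would be the downloaded association file.
ASSOCIATIONS = [
    ("H19", "breast cancer"),
    ("H19", "colon cancer"),
    ("MALAT1", "breast cancer"),
    ("MALAT1", "Alzheimer's disease"),
    ("HOTAIR", "breast cancer"),
    ("BACE1-AS", "Alzheimer's disease"),
]

# Assumed subset of the selected cancers, kept short for the example.
SELECTED_CANCERS = {"breast cancer", "colon cancer"}


def lncrnas_by_cancer(associations, cancers):
    """Map each selected cancer to the set of lncRNAs associated with it."""
    wanted = {c.lower() for c in cancers}
    grouped = defaultdict(set)
    for lncrna, disease in associations:
        key = disease.strip().lower()
        if key in wanted:
            grouped[key].add(lncrna)
    return dict(grouped)


grouped = lncrnas_by_cancer(ASSOCIATIONS, SELECTED_CANCERS)
# Distinct lncRNAs across all selected cancers (an lncRNA is counted once).
unique_lncrnas = set().union(*grouped.values())
print({cancer: len(names) for cancer, names in grouped.items()})
print(len(unique_lncrnas))
```

Using sets means that an lncRNA linked to several cancers is counted once in the overall total, which is the sense in which a single figure for all selected cancers would be reported.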


3. Chromosomal location analysis of lncRNAs
The genomic locations of the lncRNAs
were obtained from the database, which
specifies the chromosome on which each
lncRNA is located. The number of lncRNAs
present on each chromosome was recorded.
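
The per-chromosome tally can be sketched as below. This is an illustrative example under stated assumptions: the `LOCATIONS` lookup holds only five well-known lncRNAs rather than the study data set, and the names `CHROMOSOMES` and `count_per_chromosome` are introduced here for the sketch.

```python
from collections import Counter

# Small illustrative lookup of lncRNA -> chromosome; not the study data set.
LOCATIONS = {
    "H19": "chr11",
    "MALAT1": "chr11",
    "NEAT1": "chr11",
    "HOTAIR": "chr12",
    "XIST": "chrX",
}

# All human chromosomes, so that chromosomes without any lncRNA show up as 0.
CHROMOSOMES = [f"chr{i}" for i in range(1, 23)] + ["chrX", "chrY"]


def count_per_chromosome(locations):
    """Return the number of lncRNAs located on each chromosome."""
    counts = Counter(locations.values())
    return {chrom: counts.get(chrom, 0) for chrom in CHROMOSOMES}


counts = count_per_chromosome(LOCATIONS)
richest = max(counts, key=counts.get)            # chromosome with most lncRNAs
empty = [c for c, n in counts.items() if n == 0]  # chromosomes with none
print(richest, counts[richest])
print(empty)
```

Listing every chromosome explicitly, instead of only those present in the data, makes it possible to report chromosomes that carry no lncRNA at all.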


4. Identification of lncRNA-interacting proteins
involved in different cancers

Interacting protein partners of lncRNAs
associated with different cancers were collected
from the RAIN database [10] and the
experimentally supported lncRNA interaction
data (2015 version) from the LncRNA Disease
Database [6 - 8]. The proteins associated with
each lncRNA were identified by searching the
lncRNA name as a keyword in online
databases, such as RAIN and the lncRNA interaction
data, and through a literature survey. Relevant published
articles about the candidate genes were
collected using the PubMed database
(http://www.ncbi.nlm.nih.gov/pubmed). Genes
were selected primarily on the basis of their scores:
from the different proteins available in RAIN,
genes were grouped according to score and
genes of interest. The names of the proteins were
obtained from UniProt through the gene name.
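
Score-based selection of interacting partners can be sketched as follows. This is a hypothetical example: the record layout, the placeholder names (`lncRNA_X`, `partner_A`, and so on), the score values and the 0.5 cut-off are all invented for illustration and are not taken from RAIN or from this study.

```python
from typing import NamedTuple


class Interaction(NamedTuple):
    lncrna: str
    partner: str
    score: float  # assumed confidence score; higher means stronger support


# Placeholder records with invented scores, for illustration only.
RECORDS = [
    Interaction("lncRNA_X", "partner_A", 0.9),
    Interaction("lncRNA_X", "partner_B", 0.2),
    Interaction("lncRNA_X", "partner_C", 0.6),
    Interaction("lncRNA_Y", "partner_D", 0.8),
]


def top_partners(records, lncrna, min_score, limit):
    """Return up to `limit` partners of `lncrna` scoring at least `min_score`,
    ordered from highest to lowest score."""
    hits = [r for r in records if r.lncrna == lncrna and r.score >= min_score]
    hits.sort(key=lambda r: r.score, reverse=True)
    return [r.partner for r in hits[:limit]]


print(top_partners(RECORDS, "lncRNA_X", min_score=0.5, limit=10))
```

The cut-off and the limit are kept as explicit parameters because the text states only that selection was score-based, without giving a threshold.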


5. Domain analysis of lncRNA-interacting
proteins involved in different cancers

Domain analysis was performed using Pfam [11] to check
whether the interacting protein partners of a
particular lncRNA share any overlapping
domain. After that,
a pairwise score was calculated between the
domains of the interacting proteins, using
the EMBOSS pairwise alignment tool
(https://www.ebi.ac.uk/Tools/psa/).
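
The idea of a pairwise score between domain sequences can be illustrated with a simplified global (Needleman-Wunsch) alignment. This sketch is not the EMBOSS implementation: it assumes a flat match/mismatch score and a linear gap penalty instead of a substitution matrix with affine gaps, and the sequences in `DOMAINS` are made-up placeholders.

```python
from itertools import combinations


def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score with a linear gap penalty.

    Simplified scoring chosen for illustration; real tools typically use a
    substitution matrix and affine gap penalties.
    """
    # prev[j] holds the best score for aligning the processed prefix of `a`
    # with the first j characters of `b`.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


# Hypothetical domain sequences, not taken from Pfam.
DOMAINS = {
    "domain_A": "MKTAYIAK",
    "domain_B": "MKTAYIAK",
    "domain_C": "GGSPLW",
}


def pairwise_scores(domains):
    """Score every unordered pair of domains once (upper-triangular matrix)."""
    return {
        (x, y): global_alignment_score(domains[x], domains[y])
        for x, y in combinations(sorted(domains), 2)
    }


scores = pairwise_scores(DOMAINS)
for pair, score in scores.items():
    print(pair, score)
```

Only the upper triangle is computed because the score is symmetric, which matches the triangular layout in which such pairwise results are usually tabulated.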


6. Analysis of RNA expression level of selected
interacting proteins

Gene expression profiling and in situ
hybridization studies have revealed that
lncRNA expression is developmentally
regulated, can be tissue- and cell-type specific,
and can vary temporally or in response to
stimuli. Many lncRNAs are expressed in a more
tissue-specific fashion and with greater
variation between tissues than
protein-coding genes. RNA expression levels were analyzed
using the Human Protein Atlas, which aims to
map the human proteins in cells, tissues and
organs by integrating various omics
technologies [9]. The Human Protein Atlas
consists of three sub-atlases: the Tissue Atlas,
the Cell Atlas, and the Pathology Atlas.


RESULTS AND DISCUSSION: 

Some lncRNAs have been found to be
associated with many types of cancer. For
example, by analyzing 14 cancer tissues, we
found that the small nucleolar RNA host gene
16 (SNHG16) is highly expressed in bladder and
lung cancer. Two lncRNAs, MALAT1 and
H19, were found to be oncogenes in lung cancer,
breast cancer and other cancer types. Both of
them need to interact with microRNAs to
execute their functions. For example, miR-138, which
inhibits the expression of HMGA2, is targeted
by H19. H19 and MALAT1 were found in
the largest number of cancer types compared with other
lncRNAs. Thus, H19 and MALAT1 were
used for further analysis.

Fig. 1. Heatmap of cancers and lncRNAs originating from different chromosomes. Names of the cancers are given in
Table 1.


Table 1. Total number of lncRNAs in 14 cancers

Gallbladder Cancer
Kidney Cancer
Thyroid Cancer
Stomach Cancer


Table 2. Interacting protein partners of the lncRNAs H19 and MALAT1

lncRNA    Interacting Proteins
H19       METTL16, DENR, HSPB8, AKAP5
MALAT1    MMP8, HCN2, HCN4, HCN3, ATOH7, PEX5L


Table 3(a). Pairwise distance scores between domains of the interacting protein partners for H19

          METTL16  HSPB8  DENR  AKAP5
METTL16   0        12     22    11
HSPB8              0      12    8
DENR                      0     Fi
AKAP5                           0


Table 3(b). Pairwise distance scores between domains of the interacting protein partners for MALAT1

         ATOH7  HCN2  HCN3    HCN4    MMP8  PEX5L
ATOH7    0      11    2       1       15.5  20
HCN2            0     1167.5  1246    12    5
HCN3                  0       1228.5  16    5
HCN4                          0       5     5
MMP8                                  0     21.5
PEX5L                                       0


Table 4. RNA expression analysis of lncRNA interacting proteins

Columns: lncRNAs; interacting protein partners; RPKM value of breast cancer RNAs; RPKM value of colon cancer RNAs
Interacting protein partners listed: MMP8, HCN2, HCN4, HCN3

1. Chromosomal location of the lncRNAs

Analysis of the location of the lncRNAs on
different chromosomes showed that
chromosome 11 carries the highest number
(84) of lncRNAs specific to different cancers
related to obesity. Chromosome Y (chromosome
24) has no lncRNA. The result shown in the
heatmap (Fig. 1) could help
researchers analyze the specific
functions of lncRNAs in relation to their location.


2. Interaction of lncRNAs with other proteins
involved in cancers related to obesity

The selected lncRNA-interacting proteins were
retrieved from the RAIN database and through
a literature search. Over 300 different proteins
have been demonstrated to have a direct
association with the lncRNAs related to the 14
cancers associated with obesity. Here H19 and
MALAT1 were selected as the lncRNAs of interest
and, therefore, the proteins associated with
these lncRNAs were given priority.
Table 1 lists the total number of lncRNAs for
each cancer type. Breast cancer has the
largest number of lncRNAs, whereas
meningioma and stomach cancer have just 2
associated lncRNAs each.


Two lncRNAs (H19 and MALAT1) that were
found in the largest number of cancer types were
analysed to find their interacting proteins. Ten
proteins were selected on the basis of RAIN
interactions. Table 2 lists the interacting
proteins selected for the study of domains and
their related functions.


3. Overlapping domains between the 
interacting proteins 

The domain regions and domain sequences of
the interacting proteins were identified. Then, the
pairwise distance score was calculated.


Table 3(a) shows that, of all the
interacting proteins, the domains of DENR and
METTL16 have the highest pairwise distance
score. This suggests that these proteins may
share overlapping domains that
interact with the lncRNA H19.


Table 3(b) shows that, of all the
interacting proteins, the domains of HCN2 and
HCN4 as well as HCN3 and HCN4 have the
highest pairwise distance scores. Thus, these
proteins may share overlapping domains
that interact with the
lncRNA MALAT1.
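
Reading the highest-scoring pairs off a triangular score table can be sketched as below. The values are the HCN2, HCN3, HCN4 and MMP8 entries of Table 3(b); the dictionary layout and the function name `rank_pairs` are assumptions made for this illustration.

```python
# Pairwise scores copied from the HCN2/HCN3/HCN4/MMP8 entries of Table 3(b).
SCORES = {
    ("HCN2", "HCN3"): 1167.5,
    ("HCN2", "HCN4"): 1246.0,
    ("HCN3", "HCN4"): 1228.5,
    ("HCN2", "MMP8"): 12.0,
    ("HCN3", "MMP8"): 16.0,
    ("HCN4", "MMP8"): 5.0,
}


def rank_pairs(scores):
    """Return (pair, score) tuples ordered from highest to lowest score."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranked = rank_pairs(SCORES)
for pair, score in ranked[:3]:
    print(pair, score)
```

In this subset the three HCN pairs score in the thousands while every pair involving MMP8 scores below 20, which is the gap that the interpretation of shared domains rests on.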


4. RNA expression of lncRNA-interacting
proteins

Gene expression profiling and in situ
hybridization studies have revealed that
lncRNA expression is developmentally
regulated, can be tissue- and cell-type specific,
and can vary temporally or in response to
stimuli. Many lncRNAs are expressed in a more
tissue-specific fashion and with greater
variation between tissues than
protein-coding genes. RNA expression data
are reported as median
RPKM (reads per kilobase per million mapped
reads) values from the GTEx dataset. This
dataset consisted of 214 samples for breast
cancer and 3845 for colon cancer. RNA
expression data for the lncRNA-interacting protein
partners are presented in Table 4. The
expression of some interacting partners is
higher, while that of others is lower.
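
For reference, the RPKM unit mentioned above follows the standard definition: read count multiplied by 10^9 and divided by the product of transcript length in bases and the total number of mapped reads. The sketch below computes it and takes the median over samples; the read counts, transcript length and library sizes are invented for illustration and are not GTEx values.

```python
from statistics import median


def rpkm(read_count, length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    if length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return read_count * 1e9 / (length_bp * total_mapped_reads)


def median_rpkm(samples, length_bp):
    """Median RPKM over (read_count, total_mapped_reads) pairs for one gene."""
    return median(rpkm(reads, length_bp, total) for reads, total in samples)


# Invented example: one 2,000-base transcript measured in three samples,
# each with 10 million mapped reads.
SAMPLES = [(500, 10_000_000), (1000, 10_000_000), (300, 10_000_000)]
print(rpkm(500, 2000, 10_000_000))
print(median_rpkm(SAMPLES, 2000))
```

Reporting the median rather than the mean keeps a few samples with extreme expression from dominating the summary value for a tissue.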


CONCLUSION: 

Recent studies have shown lncRNAs to be
powerful regulators of adipocyte
differentiation and gene expression. Blnc1
(brown fat lncRNA 1) was identified as a
conserved lncRNA regulator of brown and beige
adipocyte differentiation [12]. In addition, the long
non-coding RNA HOTAIR is expressed in
gluteal adipose tissue and may regulate key
processes in adipocyte differentiation [13].


From the above analysis, it is concluded that the
lncRNAs H19 and MALAT1, which are common to
both colon cancer and breast cancer, are expressed
differently in different tissues. The expression
level of these lncRNAs is much higher in breast
cancer and comparatively lower in colon
cancer. The endogenous H19 gene is frequently
abundant in breast cancer patients, and the
overexpression of H19 plays an important role
in breast cancer development. Since that
discovery, accumulating reports of
dysregulated expression of lncRNAs in breast
cancer have suggested that aberrant lncRNAs
are involved in all stages of breast cancer.
Relative expression of MALAT1 was seen to be
increased in breast cancer tissues and cell lines.


One of the most abundant lncRNAs whose
expression is altered in numerous cancers is
MALAT1 (metastasis-associated lung
adenocarcinoma transcript 1). It is a highly
conserved lncRNA and exhibits an
uncommon 3'-end processing [14]. When the
MALAT1 gene was knocked down in
a mouse mammary carcinoma model, tumor
growth slowed and the rate of metastasis
was reduced. There were also alterations in
splicing patterns and thus changes in the
expression of genes involved in
protumorigenic signaling pathways [15].
MALAT1 has also been proposed as a
potential biomarker of colorectal cancer, as
MALAT1 expression was upregulated 2.26-fold
in colorectal cancer tissues compared with
non-cancerous tissues [16]. Besides cancer,
abnormal upregulation of MALAT1 was seen in
diabetes-induced microvascular dysfunction
[17]. Thus, several in vivo studies of
MALAT1 show that it is a
potential lncRNA target in
several types of cancer.


H19 is another abundantly expressed lncRNA
that has been implicated in human genetic
disorders and cancer. This lncRNA is
developmentally regulated and is highly
expressed in fetal tissues and adult muscles
[18]. H19 was also found to be a valuable target
for breast cancer therapy, as it is an
estrogen-inducible gene and promotes breast cancer
cell survival and proliferation [19]. H19 also
plays a role in gastric cancer by competing with
the microRNA miR-141, which suppresses
malignancy in human cancer [20]. A large body of
lncRNA research has shown H19 and
MALAT1 to be major genes leading to aberrant
gene expression and cancer. Thus, H19 and
MALAT1 could serve as potential biomarkers for
the early diagnosis of cancers that are
related to obesity.


Here we have analysed the biological functions
of lncRNAs in different cancers related to
obesity. Newly discovered lncRNAs have been
found to have a large impact on cancer
development and progression. Future studies
of the motifs, secondary structure or
tertiary structure of lncRNAs, as well as of the gene
regulatory networks involving lncRNAs, will help
us to understand them better and to establish
more effective strategies for using these lncRNAs as
potential biomarkers for early clinical diagnosis,
prognosis and therapeutics in patients
suffering from cancer. However, a large number of clinical
samples will be required in further studies to
confirm how specific and sensitive these
lncRNAs are as clinical biomarkers.